Overview

Dataset statistics

Number of variables: 18
Number of observations: 891
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 82.0 KiB
Average record size in memory: 94.3 B

Variable types

Numeric: 7
Categorical: 3
Text: 3
Boolean: 5

Alerts

embarked_Q is highly imbalanced (57.6%) [Imbalance]
fare_log has 15 (1.7%) infinite values [Infinite]
passengerid is uniformly distributed [Uniform]
passengerid has unique values [Unique]
sibsp has 608 (68.2%) zeros [Zeros]
parch has 678 (76.1%) zeros [Zeros]
fare has 15 (1.7%) zeros [Zeros]
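
The fare_log and fare alerts report the same count (15 rows, 1.7%). That is what one would expect if fare_log was derived as the base-10 logarithm of fare: the first sample row has fare 7.25 and fare_log 0.860338, which matches log10(7.25), and under NumPy-style floating-point rules the logarithm of a zero fare is negative infinity. The sketch below illustrates this with the standard library only. It is an assumption about how the column was built, not something the report states; the helper names `log10_like_numpy` and `log10_shifted` are hypothetical, and the log10(1 + fare) variant is one common workaround rather than the author's method.

```python
import math

def log10_like_numpy(x: float) -> float:
    """Mimic NumPy's log10 for non-negative input: 0 maps to -inf instead of raising."""
    return math.log10(x) if x > 0 else float("-inf")

def log10_shifted(x: float) -> float:
    """Alternative transform log10(1 + x), which is finite for every x >= 0."""
    return math.log10(1.0 + x)

# A few fares that appear in the report, including a zero fare
fares = [7.25, 71.2833, 0.0, 512.3292]

plain = [log10_like_numpy(f) for f in fares]
shifted = [log10_shifted(f) for f in fares]

print(plain)    # the zero fare becomes -inf
print(shifted)  # every value stays finite
```

Note that the shifted transform changes every value, not only the zeros, so it would not reproduce the finite fare_log figures shown in this report.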

Reproduction

Analysis started: 2024-07-25 16:43:26.582766
Analysis finished: 2024-07-25 16:43:32.246122
Duration: 5.66 seconds
Software version: ydata-profiling v4.9.0
Download configuration: config.json

Variables

passengerid
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 446
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45.5
Q1: 223.5
Median: 446
Q3: 668.5
95-th percentile: 846.5
Maximum: 891
Range: 890
Interquartile range (IQR): 445

Descriptive statistics

Standard deviation: 257.35384
Coefficient of variation (CV): 0.57702655
Kurtosis: -1.2
Mean: 446
Median Absolute Deviation (MAD): 223
Skewness: 0
Sum: 397386
Variance: 66231
Monotonicity: Strictly increasing
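
Because passengerid takes each integer from 1 to 891 exactly once, the figures above can be reproduced from the sequence alone. The sketch below uses only the standard library. The `percentile` helper is hypothetical and assumes linear interpolation between order statistics, and the variance is assumed to be the sample variance (n - 1 denominator), since that is the convention that matches the reported 66231.

```python
import math
import statistics

ids = list(range(1, 892))  # 1..891, one id per observation

def percentile(sorted_values, q):
    """Percentile with linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                      # coefficient of variation
iqr = percentile(ids, 0.75) - percentile(ids, 0.25)

print(total, mean, variance, round(std, 5), round(cv, 8), iqr)
```

The coefficient of variation is simply the standard deviation divided by the mean, which is why it is undefined (nan) for columns whose mean is not finite.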
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
1 1
 
0.1%
599 1
 
0.1%
588 1
 
0.1%
589 1
 
0.1%
590 1
 
0.1%
591 1
 
0.1%
592 1
 
0.1%
593 1
 
0.1%
594 1
 
0.1%
595 1
 
0.1%
Other values (881) 881
98.9%
ValueCountFrequency (%)
1 1
0.1%
2 1
0.1%
3 1
0.1%
4 1
0.1%
5 1
0.1%
6 1
0.1%
7 1
0.1%
8 1
0.1%
9 1
0.1%
10 1
0.1%
ValueCountFrequency (%)
891 1
0.1%
890 1
0.1%
889 1
0.1%
888 1
0.1%
887 1
0.1%
886 1
0.1%
885 1
0.1%
884 1
0.1%
883 1
0.1%
882 1
0.1%

survived
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Most occurring characters

Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 891 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 891 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 891 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 549 | 61.6%
1 | 342 | 38.4%

pclass
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Most occurring characters

Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 891 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 891 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 891 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 491 | 55.1%
1 | 216 | 24.2%
2 | 184 | 20.7%

title
Text

Distinct: 661
Distinct (%): 74.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Length

Max length: 23
Median length: 16
Mean length: 7.6049383
Min length: 2

Characters and Unicode

Total characters: 6776
Distinct characters: 54
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 525
Unique (%): 58.9%

Sample

1st row: Braund,
2nd row: Cumings,
3rd row: Heikkinen,
4th row: Futrelle,
5th row: Allen,
ValueCountFrequency (%)
andersson 9
 
1.0%
sage 7
 
0.8%
goodwin 6
 
0.7%
johnson 6
 
0.7%
skoog 6
 
0.7%
panula 6
 
0.7%
carter 6
 
0.7%
van 6
 
0.7%
rice 5
 
0.6%
palsson 4
 
0.4%
Other values (650) 830
93.2%

Most occurring characters

ValueCountFrequency (%)
, 867
 
12.8%
e 563
 
8.3%
n 526
 
7.8%
a 512
 
7.6%
o 436
 
6.4%
r 414
 
6.1%
l 350
 
5.2%
s 329
 
4.9%
i 316
 
4.7%
t 212
 
3.1%
Other values (44) 2251
33.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4969
73.3%
Uppercase Letter 919
 
13.6%
Other Punctuation 876
 
12.9%
Dash Punctuation 12
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 563
11.3%
n 526
10.6%
a 512
10.3%
o 436
 
8.8%
r 414
 
8.3%
l 350
 
7.0%
s 329
 
6.6%
i 316
 
6.4%
t 212
 
4.3%
d 157
 
3.2%
Other values (16) 1154
23.2%
Uppercase Letter
ValueCountFrequency (%)
S 89
 
9.7%
B 77
 
8.4%
M 76
 
8.3%
C 75
 
8.2%
H 70
 
7.6%
A 51
 
5.5%
L 50
 
5.4%
P 47
 
5.1%
G 44
 
4.8%
R 42
 
4.6%
Other values (15) 298
32.4%
Other Punctuation
ValueCountFrequency (%)
, 867
99.0%
' 9
 
1.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5888
86.9%
Common 888
 
13.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 563
 
9.6%
n 526
 
8.9%
a 512
 
8.7%
o 436
 
7.4%
r 414
 
7.0%
l 350
 
5.9%
s 329
 
5.6%
i 316
 
5.4%
t 212
 
3.6%
d 157
 
2.7%
Other values (41) 2073
35.2%
Common
ValueCountFrequency (%)
, 867
97.6%
- 12
 
1.4%
' 9
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6776
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
, 867
 
12.8%
e 563
 
8.3%
n 526
 
7.8%
a 512
 
7.6%
o 436
 
6.4%
r 414
 
6.1%
l 350
 
5.2%
s 329
 
4.9%
i 316
 
4.7%
t 212
 
3.1%
Other values (44) 2251
33.2%

name
Text

Distinct: 804
Distinct (%): 90.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Length

Max length: 74
Median length: 45
Mean length: 18.360269
Min length: 7

Characters and Unicode

Total characters: 16359
Distinct characters: 59
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 750
Unique (%): 84.2%

Sample

1st row: Mr. Owen Harris
2nd row: Mrs. John Bradley (Florence Briggs Thayer)
3rd row: Miss. Laina
4th row: Mrs. Jacques Heath (Lily May Peel)
5th row: Mr. William Henry
ValueCountFrequency (%)
mr 521
 
19.1%
miss 182
 
6.7%
mrs 129
 
4.7%
william 64
 
2.3%
john 44
 
1.6%
master 40
 
1.5%
henry 34
 
1.2%
james 24
 
0.9%
george 24
 
0.9%
charles 23
 
0.8%
Other values (916) 1648
60.3%

Most occurring characters

ValueCountFrequency (%)
1844
 
11.3%
r 1544
 
9.4%
a 1145
 
7.0%
e 1140
 
7.0%
M 1052
 
6.4%
i 1009
 
6.2%
s 968
 
5.9%
. 892
 
5.5%
n 778
 
4.8%
l 717
 
4.4%
Other values (49) 5270
32.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10477
64.0%
Uppercase Letter 2726
 
16.7%
Space Separator 1844
 
11.3%
Other Punctuation 1023
 
6.3%
Close Punctuation 144
 
0.9%
Open Punctuation 144
 
0.9%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1544
14.7%
a 1145
10.9%
e 1140
10.9%
i 1009
9.6%
s 968
9.2%
n 778
7.4%
l 717
6.8%
o 572
 
5.5%
t 455
 
4.3%
h 364
 
3.5%
Other values (16) 1785
17.0%
Uppercase Letter
ValueCountFrequency (%)
M 1052
38.6%
A 199
 
7.3%
J 184
 
6.7%
E 153
 
5.6%
H 133
 
4.9%
W 110
 
4.0%
C 97
 
3.6%
S 91
 
3.3%
L 79
 
2.9%
G 70
 
2.6%
Other values (15) 558
20.5%
Other Punctuation
ValueCountFrequency (%)
. 892
87.2%
" 106
 
10.4%
, 24
 
2.3%
/ 1
 
0.1%
Space Separator
ValueCountFrequency (%)
1844
100.0%
Close Punctuation
ValueCountFrequency (%)
) 144
100.0%
Open Punctuation
ValueCountFrequency (%)
( 144
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13203
80.7%
Common 3156
 
19.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1544
11.7%
a 1145
 
8.7%
e 1140
 
8.6%
M 1052
 
8.0%
i 1009
 
7.6%
s 968
 
7.3%
n 778
 
5.9%
l 717
 
5.4%
o 572
 
4.3%
t 455
 
3.4%
Other values (41) 3823
29.0%
Common
ValueCountFrequency (%)
1844
58.4%
. 892
28.3%
) 144
 
4.6%
( 144
 
4.6%
" 106
 
3.4%
, 24
 
0.8%
- 1
 
< 0.1%
/ 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16359
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1844
 
11.3%
r 1544
 
9.4%
a 1145
 
7.0%
e 1140
 
7.0%
M 1052
 
6.4%
i 1009
 
6.2%
s 968
 
5.9%
. 892
 
5.5%
n 778
 
4.8%
l 717
 
4.4%
Other values (49) 5270
32.2%

age
Real number (ℝ)

Distinct: 71
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.544332
Minimum: 0
Maximum: 80
Zeros: 7
Zeros (%): 0.8%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 22
Median: 29
Q3: 35
95-th percentile: 54
Maximum: 80
Range: 80
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 13.013778
Coefficient of variation (CV): 0.44048308
Kurtosis: 0.98658675
Mean: 29.544332
Median Absolute Deviation (MAD): 7
Skewness: 0.45956263
Sum: 26324
Variance: 169.35843
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
29 197
22.1%
24 31
 
3.5%
22 27
 
3.0%
28 27
 
3.0%
30 27
 
3.0%
18 26
 
2.9%
19 25
 
2.8%
21 24
 
2.7%
36 23
 
2.6%
25 23
 
2.6%
Other values (61) 461
51.7%
ValueCountFrequency (%)
0 7
0.8%
1 7
0.8%
2 10
1.1%
3 6
0.7%
4 10
1.1%
5 4
 
0.4%
6 3
 
0.3%
7 3
 
0.3%
8 4
 
0.4%
9 8
0.9%
ValueCountFrequency (%)
80 1
 
0.1%
74 1
 
0.1%
71 2
0.2%
70 3
0.3%
66 1
 
0.1%
65 3
0.3%
64 2
0.2%
63 2
0.2%
62 4
0.4%
61 3
0.3%

sibsp
Real number (ℝ)

ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.52300786
Minimum: 0
Maximum: 8
Zeros: 608
Zeros (%): 68.2%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1027434
Coefficient of variation (CV): 2.1084644
Kurtosis: 17.88042
Mean: 0.52300786
Median Absolute Deviation (MAD): 0
Skewness: 3.6953517
Sum: 466
Variance: 1.2160431
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=7)]
ValueCountFrequency (%)
0 608
68.2%
1 209
 
23.5%
2 28
 
3.1%
4 18
 
2.0%
3 16
 
1.8%
8 7
 
0.8%
5 5
 
0.6%
ValueCountFrequency (%)
0 608
68.2%
1 209
 
23.5%
2 28
 
3.1%
3 16
 
1.8%
4 18
 
2.0%
5 5
 
0.6%
8 7
 
0.8%
ValueCountFrequency (%)
8 7
 
0.8%
5 5
 
0.6%
4 18
 
2.0%
3 16
 
1.8%
2 28
 
3.1%
1 209
 
23.5%
0 608
68.2%

parch
Real number (ℝ)

ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38159371
Minimum: 0
Maximum: 6
Zeros: 678
Zeros (%): 76.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.80605722
Coefficient of variation (CV): 2.1123441
Kurtosis: 9.7781252
Mean: 0.38159371
Median Absolute Deviation (MAD): 0
Skewness: 2.749117
Sum: 340
Variance: 0.64972824
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=7)]
ValueCountFrequency (%)
0 678
76.1%
1 118
 
13.2%
2 80
 
9.0%
5 5
 
0.6%
3 5
 
0.6%
4 4
 
0.4%
6 1
 
0.1%
ValueCountFrequency (%)
0 678
76.1%
1 118
 
13.2%
2 80
 
9.0%
3 5
 
0.6%
4 4
 
0.4%
5 5
 
0.6%
6 1
 
0.1%
ValueCountFrequency (%)
6 1
 
0.1%
5 5
 
0.6%
4 4
 
0.4%
3 5
 
0.6%
2 80
 
9.0%
1 118
 
13.2%
0 678
76.1%

ticket
Text

Distinct: 681
Distinct (%): 76.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Length

Max length: 18
Median length: 17
Mean length: 6.7508418
Min length: 3

Characters and Unicode

Total characters: 6015
Distinct characters: 35
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 547
Unique (%): 61.4%

Sample

1st row: A/5 21171
2nd row: PC 17599
3rd row: STON/O2. 3101282
4th row: 113803
5th row: 373450
ValueCountFrequency (%)
pc 60
 
5.3%
c.a 27
 
2.4%
a/5 17
 
1.5%
ca 14
 
1.2%
ston/o 12
 
1.1%
2 12
 
1.1%
sc/paris 9
 
0.8%
w./c 9
 
0.8%
soton/o.q 8
 
0.7%
347082 7
 
0.6%
Other values (709) 955
84.5%

Most occurring characters

ValueCountFrequency (%)
3 746
12.4%
1 689
11.5%
2 594
9.9%
7 490
8.1%
4 464
 
7.7%
6 422
 
7.0%
0 406
 
6.7%
5 387
 
6.4%
9 328
 
5.5%
8 282
 
4.7%
Other values (25) 1207
20.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4808
79.9%
Uppercase Letter 652
 
10.8%
Other Punctuation 295
 
4.9%
Space Separator 239
 
4.0%
Lowercase Letter 21
 
0.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 151
23.2%
O 100
15.3%
P 98
15.0%
A 82
12.6%
S 74
11.3%
N 40
 
6.1%
T 36
 
5.5%
W 16
 
2.5%
Q 15
 
2.3%
I 11
 
1.7%
Other values (6) 29
 
4.4%
Decimal Number
ValueCountFrequency (%)
3 746
15.5%
1 689
14.3%
2 594
12.4%
7 490
10.2%
4 464
9.7%
6 422
8.8%
0 406
8.4%
5 387
8.0%
9 328
6.8%
8 282
 
5.9%
Lowercase Letter
ValueCountFrequency (%)
a 6
28.6%
s 5
23.8%
r 4
19.0%
i 4
19.0%
l 1
 
4.8%
e 1
 
4.8%
Other Punctuation
ValueCountFrequency (%)
. 197
66.8%
/ 98
33.2%
Space Separator
ValueCountFrequency (%)
239
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5342
88.8%
Latin 673
 
11.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 151
22.4%
O 100
14.9%
P 98
14.6%
A 82
12.2%
S 74
11.0%
N 40
 
5.9%
T 36
 
5.3%
W 16
 
2.4%
Q 15
 
2.2%
I 11
 
1.6%
Other values (12) 50
 
7.4%
Common
ValueCountFrequency (%)
3 746
14.0%
1 689
12.9%
2 594
11.1%
7 490
9.2%
4 464
8.7%
6 422
7.9%
0 406
7.6%
5 387
7.2%
9 328
6.1%
8 282
 
5.3%
Other values (3) 534
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6015
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 746
12.4%
1 689
11.5%
2 594
9.9%
7 490
8.1%
4 464
 
7.7%
6 422
 
7.0%
0 406
 
6.7%
5 387
 
6.4%
9 328
 
5.5%
8 282
 
4.7%
Other values (25) 1207
20.1%

fare
Real number (ℝ)

ZEROS 

Distinct: 248
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.204208
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.9104
Median: 14.4542
Q3: 31
95-th percentile: 112.07915
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.0896

Descriptive statistics

Standard deviation: 49.693429
Coefficient of variation (CV): 1.5430725
Kurtosis: 33.398141
Mean: 32.204208
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.7873165
Sum: 28693.949
Variance: 2469.4368
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
8.05 43
 
4.8%
13 42
 
4.7%
7.8958 38
 
4.3%
7.75 34
 
3.8%
26 31
 
3.5%
10.5 24
 
2.7%
7.925 18
 
2.0%
7.775 16
 
1.8%
7.2292 15
 
1.7%
0 15
 
1.7%
Other values (238) 615
69.0%
ValueCountFrequency (%)
0 15
1.7%
4.0125 1
 
0.1%
5 1
 
0.1%
6.2375 1
 
0.1%
6.4375 1
 
0.1%
6.45 1
 
0.1%
6.4958 2
 
0.2%
6.75 2
 
0.2%
6.8583 1
 
0.1%
6.95 1
 
0.1%
ValueCountFrequency (%)
512.3292 3
0.3%
263 4
0.4%
262.375 2
0.2%
247.5208 2
0.2%
227.525 4
0.4%
221.7792 1
 
0.1%
211.5 1
 
0.1%
211.3375 3
0.3%
164.8667 2
0.2%
153.4625 3
0.3%

sex_female
Boolean

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1023.0 B

Value | Count | Frequency (%)
False | 577 | 64.8%
True | 314 | 35.2%

sex_male
Boolean

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1023.0 B

Value | Count | Frequency (%)
True | 577 | 64.8%
False | 314 | 35.2%

embarked_C
Boolean

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1023.0 B

Value | Count | Frequency (%)
False | 723 | 81.1%
True | 168 | 18.9%

embarked_Q
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1023.0 B

Value | Count | Frequency (%)
False | 814 | 91.4%
True | 77 | 8.6%
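
The 57.6% imbalance figure for embarked_Q is consistent with a score of one minus the normalized Shannon entropy of the class counts (0% for perfectly balanced classes, 100% for a single class). The sketch below recomputes it from the counts in this section. Treating this as the formula behind the alert is an assumption based on the numbers matching, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    k = len(counts)
    return 1.0 - entropy / math.log2(k) if k > 1 else 0.0

# False / True counts taken from the report
embarked_q = imbalance_score([814, 77])
sex_male = imbalance_score([314, 577])

print(f"embarked_Q: {embarked_q:.1%}")
print(f"sex_male:   {sex_male:.1%}")
```

Under this measure a 65/35 split such as sex_male scores far lower than the 91/9 split of embarked_Q, which fits the fact that only embarked_Q is flagged.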

embarked_S
Boolean

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1023.0 B

Value | Count | Frequency (%)
True | 646 | 72.5%
False | 245 | 27.5%

familysize
Real number (ℝ)

Distinct: 9
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9046016
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 6
Maximum: 11
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.6134585
Coefficient of variation (CV): 0.84713704
Kurtosis: 9.159666
Mean: 1.9046016
Median Absolute Deviation (MAD): 0
Skewness: 2.7274415
Sum: 1697
Variance: 2.6032485
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=9)]
ValueCountFrequency (%)
1 537
60.3%
2 161
 
18.1%
3 102
 
11.4%
4 29
 
3.3%
6 22
 
2.5%
5 15
 
1.7%
7 12
 
1.3%
11 7
 
0.8%
8 6
 
0.7%
ValueCountFrequency (%)
1 537
60.3%
2 161
 
18.1%
3 102
 
11.4%
4 29
 
3.3%
5 15
 
1.7%
6 22
 
2.5%
7 12
 
1.3%
8 6
 
0.7%
11 7
 
0.8%
ValueCountFrequency (%)
11 7
 
0.8%
8 6
 
0.7%
7 12
 
1.3%
6 22
 
2.5%
5 15
 
1.7%
4 29
 
3.3%
3 102
 
11.4%
2 161
 
18.1%
1 537
60.3%
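
The reported totals are consistent with familysize being sibsp + parch + 1 (the passenger counted together with their relatives) and with isalone flagging a familysize of 1: the column sums satisfy 466 + 340 + 891 = 1697, and 537 passengers have familysize 1, the same count as isalone = 1. The report does not state how these columns were built, so the sketch below is an assumption checked only against the figures and sample rows shown in this report; the function names are hypothetical.

```python
def family_size(sibsp: int, parch: int) -> int:
    """Assumed derivation: siblings/spouses + parents/children + the passenger."""
    return sibsp + parch + 1

def is_alone(size: int) -> int:
    """Assumed derivation: 1 when the passenger travels without relatives."""
    return 1 if size == 1 else 0

# (sibsp, parch, familysize, isalone) taken from rows of the report's sample table
sample_rows = [(1, 0, 2, 0), (0, 0, 1, 1), (3, 1, 5, 0), (0, 5, 6, 0)]
for sibsp, parch, expected_size, expected_alone in sample_rows:
    size = family_size(sibsp, parch)
    assert size == expected_size
    assert is_alone(size) == expected_alone

# Column sums reported in the descriptive statistics
sum_sibsp, sum_parch, n_rows, sum_familysize = 466, 340, 891, 1697
print(sum_sibsp + sum_parch + n_rows == sum_familysize)
```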

isalone
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.1 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 537 | 60.3%
0 | 354 | 39.7%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Most occurring characters

Value | Count | Frequency (%)
1 | 537 | 60.3%
0 | 354 | 39.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 891 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 537 | 60.3%
0 | 354 | 39.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 891 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 537 | 60.3%
0 | 354 | 39.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 891 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 537 | 60.3%
0 | 354 | 39.7%

fare_log
Real number (ℝ)

INFINITE 

Distinct: 248
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 15
Infinite (%): 1.7%
Mean: -inf
Minimum: -inf
Maximum: 2.7095491
Zeros: 0
Zeros (%): 0.0%
Negative: 15
Negative (%): 1.7%
Memory size: 7.1 KiB

Quantile statistics

Minimum: -inf
5-th percentile: 0.85883785
Q1: 0.89819771
Median: 1.1599941
Q3: 1.4913617
95-th percentile: 2.0495001
Maximum: 2.7095491
Range: inf
Interquartile range (IQR): 0.59316399

Descriptive statistics

Standard deviation: nan
Coefficient of variation (CV): nan
Kurtosis: nan
Mean: -inf
Median Absolute Deviation (MAD): 0.2648921
Skewness: nan
Sum: -inf
Variance: nan
Monotonicity: Not monotonic
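
The -inf mean and nan variance above follow from floating-point arithmetic once a single -inf is in the column: any sum containing -inf is -inf, and the deviation of an -inf value from an -inf mean is (-inf) - (-inf), which is nan. The standard-library sketch below illustrates this with a few fare_log values from the sample rows plus one -inf. The inner quantiles in the report stay finite because they are read from sorted positions rather than computed from sums. Filtering to `finite_only` is one possible way to obtain usable moments; it is an assumption for illustration, not something the report does.

```python
import math

values = [float("-inf"), 0.860338, 1.852988, 0.898999]  # one -inf among finite logs

mean = sum(values) / len(values)
squared_dev = [(v - mean) ** 2 for v in values]
variance = sum(squared_dev) / (len(values) - 1)

print(mean)      # -inf
print(variance)  # nan

# Dropping non-finite entries first gives finite moments again
finite_only = [v for v in values if math.isfinite(v)]
finite_mean = sum(finite_only) / len(finite_only)
print(round(finite_mean, 6))
```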
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
0.9057958804 43
 
4.8%
1.113943352 42
 
4.7%
0.8973961392 38
 
4.3%
0.8893017025 34
 
3.8%
1.414973348 31
 
3.5%
1.021189299 24
 
2.7%
0.8989992709 18
 
2.0%
0.8907003977 16
 
1.8%
0.8590902399 15
 
1.7%
-inf 15
 
1.7%
Other values (238) 615
69.0%
ValueCountFrequency (%)
-inf 15
1.7%
0.6034150454 1
 
0.1%
0.6989700043 1
 
0.1%
0.7950105586 1
 
0.1%
0.808717242 1
 
0.1%
0.8095597146 1
 
0.1%
0.8126326449 2
 
0.2%
0.8293037728 2
 
0.2%
0.8362164784 1
 
0.1%
0.8419848046 1
 
0.1%
ValueCountFrequency (%)
2.709549109 3
0.3%
2.419955748 4
0.4%
2.418922452 2
0.2%
2.3936117 2
0.2%
2.357029123 4
0.4%
2.345920813 1
 
0.1%
2.325310372 1
 
0.1%
2.324976566 3
0.3%
2.217132945 2
0.2%
2.186002269 3
0.3%

Interactions

[Figures: 49 pairwise interaction plots; images not preserved in this text version]

Missing values

[Figure not preserved] A simple visualization of nullity by column.
[Figure not preserved] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

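First rows

index | passengerid | survived | pclass | title | name | age | sibsp | parch | ticket | fare | sex_female | sex_male | embarked_C | embarked_Q | embarked_S | familysize | isalone | fare_log
0 | 1 | 0 | 3 | Braund, | Mr. Owen Harris | 22 | 1 | 0 | A/5 21171 | 7.2500 | False | True | False | False | True | 2 | 0 | 0.860338
1 | 2 | 1 | 1 | Cumings, | Mrs. John Bradley (Florence Briggs Thayer) | 38 | 1 | 0 | PC 17599 | 71.2833 | True | False | True | False | False | 2 | 0 | 1.852988
2 | 3 | 1 | 3 | Heikkinen, | Miss. Laina | 26 | 0 | 0 | STON/O2. 3101282 | 7.9250 | True | False | False | False | True | 1 | 1 | 0.898999
3 | 4 | 1 | 1 | Futrelle, | Mrs. Jacques Heath (Lily May Peel) | 35 | 1 | 0 | 113803 | 53.1000 | True | False | False | False | True | 2 | 0 | 1.725095
4 | 5 | 0 | 3 | Allen, | Mr. William Henry | 35 | 0 | 0 | 373450 | 8.0500 | False | True | False | False | True | 1 | 1 | 0.905796
5 | 6 | 0 | 3 | Moran, | Mr. James | 29 | 0 | 0 | 330877 | 8.4583 | False | True | False | True | False | 1 | 1 | 0.927283
6 | 7 | 0 | 1 | McCarthy, | Mr. Timothy J | 54 | 0 | 0 | 17463 | 51.8625 | False | True | False | False | True | 1 | 1 | 1.714853
7 | 8 | 0 | 3 | Palsson, | Master. Gosta Leonard | 2 | 3 | 1 | 349909 | 21.0750 | False | True | False | False | True | 5 | 0 | 1.323768
8 | 9 | 1 | 3 | Johnson, | Mrs. Oscar W (Elisabeth Vilhelmina Berg) | 27 | 0 | 2 | 347742 | 11.1333 | True | False | False | False | True | 3 | 0 | 1.046624
9 | 10 | 1 | 2 | Nasser, | Mrs. Nicholas (Adele Achem) | 14 | 1 | 0 | 237736 | 30.0708 | True | False | True | False | False | 2 | 0 | 1.478145

Last rows

index | passengerid | survived | pclass | title | name | age | sibsp | parch | ticket | fare | sex_female | sex_male | embarked_C | embarked_Q | embarked_S | familysize | isalone | fare_log
881 | 882 | 0 | 3 | Markun, | Mr. Johann | 33 | 0 | 0 | 349257 | 7.8958 | False | True | False | False | True | 1 | 1 | 0.897396
882 | 883 | 0 | 3 | Dahlberg, | Miss. Gerda Ulrika | 22 | 0 | 0 | 7552 | 10.5167 | True | False | False | False | True | 1 | 1 | 1.021879
883 | 884 | 0 | 2 | Banfield, | Mr. Frederick James | 28 | 0 | 0 | C.A./SOTON 34068 | 10.5000 | False | True | False | False | True | 1 | 1 | 1.021189
884 | 885 | 0 | 3 | Sutehall, | Mr. Henry Jr | 25 | 0 | 0 | SOTON/OQ 392076 | 7.0500 | False | True | False | False | True | 1 | 1 | 0.848189
885 | 886 | 0 | 3 | Rice, | Mrs. William (Margaret Norton) | 39 | 0 | 5 | 382652 | 29.1250 | True | False | False | True | False | 6 | 0 | 1.464266
886 | 887 | 0 | 2 | Montvila, | Rev. Juozas | 27 | 0 | 0 | 211536 | 13.0000 | False | True | False | False | True | 1 | 1 | 1.113943
887 | 888 | 1 | 1 | Graham, | Miss. Margaret Edith | 19 | 0 | 0 | 112053 | 30.0000 | True | False | False | False | True | 1 | 1 | 1.477121
888 | 889 | 0 | 3 | Johnston, | Miss. Catherine Helen "Carrie" | 29 | 1 | 2 | W./C. 6607 | 23.4500 | True | False | False | False | True | 4 | 0 | 1.370143
889 | 890 | 1 | 1 | Behr, | Mr. Karl Howell | 26 | 0 | 0 | 111369 | 30.0000 | False | True | True | False | False | 1 | 1 | 1.477121
890 | 891 | 0 | 3 | Dooley, | Mr. Patrick | 32 | 0 | 0 | 370376 | 7.7500 | False | True | False | True | False | 1 | 1 | 0.889302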